Linguistic, Philosophical, and Pragmatic Aspects of Type-Directed Natural Language Parsing
Abstract
We describe how type information can be used to infer grammatical structure. This is in contrast to conventional type inference in programming languages, where the roles are reversed: structure determines type. Our work is based on Applicative Universal Grammar (AUG), a linguistic theory that views the formation of phrases in a form analogous to function application in a programming language. We describe our overall methodology, including its linguistic and philosophical underpinnings. The natural language parser that we have constructed should be interesting to computer scientists in the way in which AUG relates to types and the combinatory calculus, and to linguists in the way in which a very simple, brute-force parsing strategy performs surprisingly well in both performance and accuracy. Both computer scientists and linguists should also be interested in the convergence of the theory of functional programming languages and AUG with regard to their conceptual bases. The two have been motivated by entirely different goals and have developed independently, but they are rooted in a common conceptual system of an applicative calculus.

1 Functional Programming Languages and Applicative Universal Grammar: a Common Paradigm

The central goal of a theoretical study of linguistics is to reveal how a human being uses languages to express and communicate thought. The philosophy of language is concerned with evaluating how particular theories of language succeed in attaining this central goal. Finally, the pragmatics of language concerns itself with the ability to implement a theory, for example in the form of a natural language processing system. Linguists and computer scientists benefit from exchanges concerning theory, philosophy, and pragmatics, and in this paper we touch on all three of these issues.

We first argue that the theories of functional languages and Applicative Universal Grammar (AUG) share a common underlying paradigm. By paradigm we mean a pattern of thought, a system of concepts and assumptions underlying a group of theories, regardless of how they differ from one another in detail. Some of the fundamental ideas of this common paradigm can be traced to the works of Haskell B. Curry on combinatory logic and its philosophical and linguistic implications. Indeed, our argument leads to the realization of a natural language parser written in the functional language Haskell and based on the theory of AUG.

The first complete description of AUG was published in 1965 [16], unifying the categorial calculus of Lesniewski [12] with the combinatory calculus of Curry and Feys [5]. The semantic theory of AUG was presented in [17], and its use in the translation of natural languages is given in [19]. A full description of the current state of AUG is given in [18]. A description of the feasibility of natural language parsing using AUG was first given in [11] as a literate Haskell [10] program. The functional language Haskell is named in memory of Haskell B. Curry, the logician whose work on combinatory logic provides much of the foundation for both functional programming languages and AUG. Indeed, Curry himself was interested in the study of natural language and grammatical structure [4]. Related work on using a functional language for NLP may be found in [6], which differs from ours in being based on principles proposed by Montague [15].
2 The Rise of Functional Programming Languages

The earliest programming languages were developed with one goal in mind: to provide a vehicle through which one could control the behavior of computers. This goal seemed reasonable, but quickly ran into two problems: first, it became obvious that what was easy for a machine was not necessarily easy for a human being; and second, as the number of different kinds of machines increased, the need arose for a common language with which to program them all. Thus from raw object code and primitive assembly languages, a large number of high-level programming languages emerged, beginning with FORTRAN in the 1950s.

High-level languages are generally divided into two classes: imperative languages and declarative languages. Imperative languages are lower level—they say a lot about how to compute a result—whereas declarative languages are at a higher level—they say more about what should be computed than about how. Functional programming languages are declarative languages whose underlying model of computation is the function (in contrast to the relation that forms the basis for logic programming languages). (See [8] for a more thorough discussion of these issues.)

The key point here is that higher-level languages are less dependent on the details of the underlying hardware. The earliest programming languages depended on the hardware completely, reflecting fairly accurately the structure of the underlying machine. The emergence of high-level imperative languages was a considerable step forward, but still only a partial liberation from the machine. Only through the emergence of declarative languages, in particular pure functional languages, did a fairly complete liberation take place.

Indeed, in studying any language—a programming language, a language of mathematics or logic, or a natural language such as English or Russian—we must distinguish two very different kinds of structure: the physical structure and the functional structure of the language. The physical structure of programming languages is a reflection of the physical structure of the computer. The physical structure of natural language is so-called phrase structure, which reflects the action of the human machine, i.e. the organs of speech. Because of this dependence on the human organs of speech, natural language has a linear phrase structure. However, the essential structure of natural language, its functional structure, is independent of the linear phrase structure.

The parallel between programming languages and natural languages with respect to the distinction between physical structure and functional structure is clear. Does this distinction apply to the languages of mathematics and logic? We find a partial answer to this question in the illuminating studies of Haskell B. Curry, who distinguished two kinds of structure: the structure of concatenative systems in standard formalizations of logic, and the applicative structure of ob-systems in combinatory logic. In Curry's terminology, a concatenative system is a concrete system depending on physical properties of signs; that is, on their properties represented as linear strings. The concatenation operation is an operation of combining signs in their linear order. Here is how Curry contrasts the concatenative structure and the applicative structure:

“In the first place, combinatory logic was formulated as a system in which the formal objects were rather differently conceived than was the case in the standard formalization of logic.
The standard procedure at that time was to demand that the formal objects be expressions of some “object language;” this means that they be strings formed from the symbols of that object language by concatenation. In combinatory logic these formal objects, called obs, were wholly unspecified. It was merely postulated that there was a binary operation of application among them, that the obs be constructed from the primitive objects, called atoms, by these operations, and that the construction of an ob be unique. This means that the obs were thought of not as strings of atoms, but as structures like a genealogical tree.” [4, pp. 64–65]

From this discussion we adopt the premise that the applicative structure is the essential structure of any language—a programming language, a language of mathematics or logic, or a natural language.

3 The Rise of Applicative Universal Grammar

Applicative Universal Grammar (AUG) develops the ideas of Ferdinand de Saussure about language as a system of signs. It is based on the observation that some fundamental ideas of linguistics and combinatory logic converge. Thus AUG consists of a formal model of natural languages, called the genotype, that is based on combinatory logic. The basic concept of AUG is the applicative structure, which contrasts sharply with the concept of phrase structure, the standard formal system in many linguistic theories. The essential properties of the applicative structure of AUG are characterized by the following principles.

Principle 1 The Transfer Principle. A grammatical structure or class is independent of its symbolic representation, and so may be transferred from one symbolic device to another without changing its meaning.

For example, subject-object relations can be represented by case markers, as in Russian or Latin, or by word order, as in English. In a Russian sentence consisting of three words—subject, predicate, and direct object—the words can be permuted in six different ways without changing the meaning of the sentence. Grammatical relations are invariant with respect to their symbolic representations.

Principle 2 The Genotype-Phenotype Principle. Grammatical structures and classes, independent of their symbolic representation, form a universal system of grammatical categories of natural languages, called the genotype. Symbolic representations of grammatical structures and classes of specific natural languages form concrete systems called phenotypes.

This principle is a corollary of the Transfer Principle, and further constrains grammar by imposing the genotype and phenotype levels on it. It calls for a revision of linguistic theories which confound these levels. In accordance with the Genotype-Phenotype Principle, we must clearly distinguish rules of grammar from laws of grammar. Rules of grammar are language-specific; they are part of phenotypes. Laws of grammar are universals underlying rules of grammar; they are part of the genotype. A law of grammar is an invariant over a class of particular rules of grammar. As an example, consider the active/passive distinction in English:

Columbus discovered America.
America was discovered by Columbus.
Various formulations of the rule of passivization in English differ greatly in detail, but it is generally agreed that (1) passive in English applies to strings in which two noun phrases and a verb occur in this order: First Noun Phrase—Active Verb—Second Noun Phrase; and (2) passivization involves the postponing of the preverbal noun phrase and the preposing of the postverbal noun phrase: Second Noun Phrase—Passive Verb—First Noun Phrase. In other words, the standard rule of passivization in English is stated in terms of word order.

A rule of passivization such as this is clearly language-specific. Since different languages have different word orders—for example, in Malagasy, the direct object precedes the verb and the subject follows the verb—there must be distinct language-specific rules of passivization for these other languages. Furthermore, in languages such as Russian or Latin where word order does not have grammatical meaning, the rule of passivization is stated in terms of case endings.

From the perspective of the Transfer Principle, we face an empirical question: is there an invariant of the particular rules of passivization in particular languages? We believe that there is: a universal law that we call the Law of Passivization. This law is roughly characterized by two invariant processes: (1) a conversion of the predicate relation, and (2) a superposition of the second term of the converted predicate with an oblique term. This law will become clearer later, although its complete formulation is beyond the scope of this paper and may be found in [18].

We claim that the Law of Passivization—and other laws as well—are universal. But in fact, in some cases a law is simply not applicable to certain languages, and the Law of Passivization is no exception: the active/passive correspondence does not exist in some languages. Our use of the term “universal” is from the perspective of treating the theory of grammar as a highly abstract branch of linguistics whose goal is to establish a single hypothetical grammatical system from which all possible structures and entities may be deduced. Universal laws hold for this abstract system, but concrete languages are in many cases subsets of the whole system, and thus certain laws may not be applicable. This should not deter us from seeking universal laws, for two important reasons. First, universal laws allow us to recognize common characteristics of otherwise very different languages. Second, they allow us to imagine, theoretically at least, what the instantiation of particular laws might look like in languages for which they are currently not applicable.

Principle 3 The Principle of Semiotic Relevance. The only distinctions between meanings that are semiotically relevant are those that correlate with distinctions between their signs, and, vice versa, the only distinctions between signs that are relevant are those that correlate with distinctions between their meanings.

One can never overstate the significance of the Principle of Semiotic Relevance. If one wants to present de Saussure's doctrine in a single theoretical statement, the Principle of Semiotic Relevance is it. This principle defines the essence of linguistic reality and is a keystone of the semiotic study of language. The Principle of Semiotic Relevance is a powerful constraint on the theory of grammar, and not all linguistic theories conform well with this principle.
For example, generative phonology [13, 3, 7] considers only the sound patterns of morphemes, completely disregarding their meanings. As a result, it wrongly identifies certain morphemes by positing fictitious relationships between them. The fundamental error of generative phonology is that it relates cognate forms based entirely on formal criteria, without regard to the meanings of the forms. For example, disregard of the meanings of the forms of morphemes leads to a confusion between synchrony and diachrony [18].

The opposite error is encountered in generative semantics, which fails to support distinctions in meanings with concomitant distinctions in phonic expressions. Consider, for instance, McCawley's famous analysis of ‘kill’ as a causative verb in English. In a bracketed notation this analysis reads: (cause (become (minus alive))); that is, “cause becomes minus alive,” which is meant to be a semantic componential analysis of the verb ‘kill’. This analysis is weak because it is based on the idea that, given a possible paraphrase of the verb ‘kill’, the verb must ipso facto be considered causative. In accordance with the Principle of Semiotic Relevance, any difference between linguistic meanings must be correlated with differences between phonic expressions. Real causative verbs are characterized by appropriate phonic markers, as in the forms sit : set (I sit by the table, I set the table) and fall : fell (the tree falls, the lumberjack fells the tree). The verb ‘kill’ has none of these phonological markers of the causative meaning.

Linguistic meaning is vital for communication and is an essential aspect of every use of language; but linguistic meaning does not constitute the total meaning of a sentence or word. Consider the sentence ‘Garry Kasparov and I.B.M.'s computer Deep Blue came to a draw in the fourth game yesterday.’ The linguistic meaning of the sentence is determined by the dictionary and the rules of the grammar of English. But the sentence means more than that. A man who knows chess can infer from the context of the word ‘game’ that it was a game of chess. He may also infer that Kasparov and Deep Blue had played three games before the game yesterday. He may infer further that Deep Blue is a superstrong chess program, because Kasparov is the world champion of chess. A man who does not know chess cannot infer from the meaning of the word ‘game’ that it was a chess game. From the sentence ‘John killed a bear’ we infer that John caused a bear not to be alive, but causation is an inferential meaning that is parasitic on the linguistic meaning of ‘kill’.

The total meaning of a sentence or word is a compound containing the linguistic meaning combined with other kinds of meaning, just as a chemical compound contains a certain substance combined with other substances. To isolate a certain substance from other substances, one uses chemical reagents. The analysis of meaning is mental chemistry. The chemical reagent of the linguist is the Principle of Semiotic Relevance. Using it, the linguist isolates the linguistic meaning in its pure form.

Principle 4 The Unity of Syntactic and Semantic Representation. Syntactic and semantic representation cannot be separated from each other; they constitute a unique representation, called contensive representation.

This principle is a corollary of the Principle of Semiotic Relevance.
It follows from this principle that any distinction in semantic representation must correlate with a distinction in syntactic representation, and vice versa: any distinction in syntactic representation must correlate with a distinction in semantic representation. A system of contensive representation is called contensive syntax. Contensive syntax is a new concept which should not be confused with semantics. The existence of semantic rules presupposes the existence of syntactic rules. In contrast, contensive syntax is a unitary system of rules which is, so to speak, a chemical bond of structure and meaning. Just as water is a completely new substance in comparison with hydrogen and oxygen taken separately, so contensive syntax is a completely new entity in comparison with structure and meaning, which taken separately are not part of linguistics; they are part of logic.

The fundamental constraint on the combination of signs is the Sign Combination Principle.

Principle 5 The Sign Combination Principle. A sign, called an operator, combines with one or more signs, called its operands, to form a new sign, called its resultant, on the condition that its meaning is incomplete and needs to be supplemented by the meanings of other signs.

For example, verbs and adjectives are operators with respect to nouns because the meanings of verbs and adjectives are incomplete and in need of supplementation by the meanings of nouns. Consider ‘boy’ or ‘paper.’ The meanings of these nouns are complete. Take now ‘writes’ and ‘white.’ We ask: Who writes? What is white? The meanings of these words are incomplete; they need to be supplemented by the meanings of nouns such as ‘boy’ or ‘paper’. In ‘the boy writes’ the verb ‘writes’ is an operator and ‘the boy’ is its operand; in ‘white paper’ the adjective ‘white’ is an operator and ‘paper’ is its operand. Similarly, the meaning of a preposition is incomplete without supplementation by the meaning of a noun; therefore prepositions are operators with respect to nouns. In ‘on the table,’ ‘on’ is an operator and ‘the table’ its operand. Furthermore, the meaning of a conjunction is incomplete, and needs to be supplemented by the meanings of words belonging to the basic word categories—nouns, adjectives, verbs, adverbs—or of complete sentences. Therefore a conjunction is an operator with respect to expressions of all these categories: in ‘black and white,’ ‘and’ is an operator with respect to ‘black’ and ‘white.’ In a later section we will see more elaborate examples of the Sign Combination Principle, including chains and hierarchies of meaning supplementation.

Principle 6 The Principle of Monotonic Constructions. Any combination of linguistic units has a unique construction; in algebraic terms, any combination of linguistic units is non-associative.

Being a corollary of the Sign Combination Principle, the Principle of Monotonic Constructions is of paramount importance for linguistics. It excludes systems such as generalized categorial grammar [14], whose associativity means that a sentence can be bracketed in every possible way. Moortgat motivates the use of the associative calculus as follows:

The application analysis for ‘John loves Mary’ is strongly equivalent to the conventional phrase-structure representation for a sequence subject—transitive verb—direct object, with the transitive verb and the direct object grouped into a VP constituent.
Suppose now that we are not so much interested in constituent structure, as commonly understood, but rather in the notion of derivability, that is, in the question: given a sequence of input types (viewed as sets of expressions), what type(s) can be derived from the concatenation of the input expressions? It will be clear that the result type S would also be derivable if the transitive verb had been assigned the type NP\(S/NP) instead of (NP\S)/NP. [14, p. 148]

Associativity considerably simplifies the construction of mathematical models of language, but it distorts linguistic reality. Associativity is motivated primarily by convenience: an associative calculus is much more convenient for parsing a string of words in a purely mechanical fashion. The trouble is, as follows from the Sign Combination Principle, that the sentences of a natural language have a non-associative structure. Thus if we want to understand their structure, we have no choice but to construct a non-associative calculus.

Principle 7 The Principle of Type Assignment. Every sign of the applicative system is assigned a type which defines its syntactic function. The Principle of Type Assignment is subject to the following conditions:

1. Inclusion. Every atomic sign is assigned a characteristic type.
2. Exclusion. No sign belongs to more than one characteristic type.
3. Superposition. Every sign can be assigned a complementary type superposed on its characteristic type.
4. Polymorphism. Some signs can be assigned variable types. The range of a variable type includes concrete types having related functions.

Principle 8 The Superposition Principle. If in a given context C a unit A takes on the function of the unit B as its complementary function, a syncretic unit 〈A/B〉 is formed. We say that A and B are superposed in the syncretic unit 〈A/B〉, and we call the operation of forming 〈A/B〉 the superposition of A with B. Given 〈A/B〉, A is called the basis and B the overlay.

Superposed types are governed by the following principles:

1. Existence. The unit 〈x/y〉 exists in a given context C if the unit x is superposed with the unit y.
2. Identity. A superposed unit is distinct from its basis. Two superposed types are the same only if their bases and overlays are the same.
3. Inheritance. In any context C in which a superposed unit exists, it has those normal properties possessed by its basis.

Let us turn to examples of superposition. Consider the word ‘lion’. The characteristic meaning of ‘lion’ is the name of an animal, but in combination with some words it takes on the meaning ‘a famous and important person,’ as in ‘literary lion’. The characteristic meaning of ‘black’ is ‘of the darkest color,’ but the word may take on the meaning ‘very bad,’ as in ‘black despair’. These are examples of the action of modifying contexts. A modifying context changes the meaning of the word with which it combines: the word becomes synonymous with some other word or expression. In our examples, the word ‘lion’ is synonymous with ‘a famous and important person’ in the context of ‘literary’, and the word ‘black’ is synonymous with ‘very bad’ in the context of ‘despair’. Due to the action of its modifying context, the meaning of a word becomes figurative, representing a syncretism of two meanings: the initial meaning of the word and the meaning of the expression with which the word is synonymous. This is a case of polysemy: due to the action of the modifying context, the word becomes polysemous.
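To make the Identity and Inheritance conditions concrete, the following is a minimal Haskell sketch of superposed units; the names Unit, Sign, Super, and basis are our own illustration, not part of AUG or of the parser described later.

    -- A hypothetical sketch of syncretic units under the Superposition
    -- Principle. A unit is either a plain sign or a superposed unit
    -- <basis/overlay>.
    data Unit = Sign String
              | Super Unit Unit   -- Super basis overlay, i.e. <basis/overlay>
      deriving (Eq, Show)

    -- Identity: the derived structural equality makes a superposed unit
    -- distinct from its basis, and makes two superposed units equal only
    -- when their bases and overlays are equal.

    -- Inheritance: a superposed unit keeps the normal properties of its
    -- basis, so characteristic queries defer to the innermost basis.
    basis :: Unit -> Unit
    basis (Super b _) = basis b
    basis u           = u

    -- Example: 'lion' superposed with 'a famous and important person'
    -- in the modifying context of 'literary'.
    literaryLion :: Unit
    literaryLion = Super (Sign "lion") (Sign "famous person")

Here basis literaryLion yields Sign "lion", reflecting Inheritance, while literaryLion /= Sign "lion" holds, reflecting Identity.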
Nouns and adjectives seem to behave in a similar way: in some contexts they fulfill the role of the argument of a predicate, and in other contexts the role of an attribute of a noun. If we classify nouns and adjectives as polymorphic, then we must admit that their polymorphism is identical and that nouns and adjectives are identical at the level of their phrasal projection. But the resulting type ambiguity of lexical classes would then conflict with the generally accepted notion of lexical classes as morphologically and syntactically distinct entities. In search of a plausible explanation, we arrive at the hypothesis of a hierarchy of syntactic types assigned to each lexical class. It is this hierarchy that is explained by the Superposition Principle. This analysis reveals the opposition between the noun and the adjective: the characteristic type of the noun is the complementary type of the adjective, and, conversely, the characteristic type of the adjective is the complementary type of the noun. A sign with a complementary type superposed on its characteristic type displays duality: it takes on the properties of the complementary type superposed on its characteristic type, but retains at least part of the properties of its characteristic type.

4 Parsing Natural Language Based on AUG

To understand the way in which parsing using AUG works, it is useful to think of words and phrases as atoms and expressions, respectively, in a typed language of combinators. For our simplified version of AUG, there are just two primitive types: T, representing terms (for example, nouns such as ‘friend’ and noun phrases such as ‘my friend’), and S, representing complete sentences (such as ‘my friend runs’). The only non-primitive type is of the form Oxy, denoting phrases that transform phrases of type x into modified phrases of type y; this is the most important concept behind the AUG formalism. For example, the word ‘my’ is treated as having type OTT, since it is applied to a term of type T to obtain a modified term, also of type T (every word is pre-assigned one or more types in this way). Thus the construction of the noun phrase ‘my friend’ can be described by an inference:

    ‘my’ :: OTT    ‘friend’ :: T
    ----------------------------
         ‘my friend’ :: T
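This inference mirrors typed function application, and the type discipline just described transcribes directly into Haskell. Below is a minimal sketch under our own naming; the Type constructors and the apply function are illustrative assumptions, and the literate Haskell parser of [11] is more elaborate.

    -- A minimal sketch of the simplified AUG type language described
    -- above: primitive types T (terms) and S (sentences), and O x y,
    -- the type of phrases transforming phrases of type x into phrases
    -- of type y.
    data Type = T | S | O Type Type
      deriving Eq

    instance Show Type where
      show T       = "T"
      show S       = "S"
      show (O x y) = "O" ++ show x ++ show y

    -- Application: an operator of type O x y combines with an operand
    -- of type x to yield a resultant of type y.
    apply :: Type -> Type -> Maybe Type
    apply (O x y) x' | x == x' = Just y
    apply _       _            = Nothing

    -- Example: 'my' :: OTT applied to 'friend' :: T gives type T,
    -- the type of the noun phrase 'my friend'.
    main :: IO ()
    main = print (apply (O T T) T)   -- prints: Just T

A brute-force parser in this style would presumably try apply on adjacent phrases in both orders, since either the left or the right phrase may act as the operator, keeping every combination whose types fit.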